@prmoore77
Contributor

Closes #733

Thank You for Your Contribution!

We appreciate your effort and contribution to the project. To ensure that your Pull Request (PR) adheres to our guidelines, please review the rules in our contribution guidelines:

ClickHouse/ClickBench Contribution Rules

Thank you for your attention to these details and for helping us maintain the quality and integrity of the project.

@CLAassistant

CLAassistant commented Dec 31, 2025

CLA assistant check
All committers have signed the CLA.

@prmoore77 changed the title from "Completed GizmoSQL c6a.4xlarge benchmark" to "Adding GizmoSQL" on Dec 31, 2025
@prmoore77 changed the title from "Adding GizmoSQL" to "Add GizmoSQL" on Dec 31, 2025
@george-larionov george-larionov self-assigned this Jan 9, 2026
@george-larionov
Member

Hi @prmoore77, thanks for your submission! However, I am having trouble getting it to run. When I run benchmark.sh I get the following output when creating the table (line 33):

Jan 10, 2026 6:18:08 PM org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.flight.auth2.ClientHandshakeWrapper doClientHandshake
SEVERE: Failed on completing future
org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.flight.FlightRuntimeException: UNAVAILABLE: io exception
        at org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.flight.CallStatus.toRuntimeException(CallStatus.java:121)
        at org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.flight.grpc.StatusUtils.fromGrpcRuntimeException(StatusUtils.java:161)
        at org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.flight.grpc.StatusUtils.fromThrowable(StatusUtils.java:182)
        at org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.flight.auth2.ClientHandshakeWrapper.doClientHandshake(ClientHandshakeWrapper.java:55)
        at org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.flight.FlightClient.handshake(FlightClient.java:202)
        at org.apache.arrow.driver.jdbc.client.utils.ClientAuthenticationUtils.getAuthenticate(ClientAuthenticationUtils.java:107)
        at org.apache.arrow.driver.jdbc.client.utils.ClientAuthenticationUtils.getAuthenticate(ClientAuthenticationUtils.java:92)
        at org.apache.arrow.driver.jdbc.client.ArrowFlightSqlClientHandler$Builder.build(ArrowFlightSqlClientHandler.java:968)
        at org.apache.arrow.driver.jdbc.ArrowFlightConnection.createNewClientHandler(ArrowFlightConnection.java:119)
        at org.apache.arrow.driver.jdbc.ArrowFlightConnection.createNewConnection(ArrowFlightConnection.java:89)
        at org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver.connect(ArrowFlightJdbcDriver.java:90)
        at org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver.connect(ArrowFlightJdbcDriver.java:46)
        at sqlline.DatabaseConnection.connect(DatabaseConnection.java:135)
        at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:192)
        at sqlline.Commands.connect(Commands.java:1487)
        at sqlline.Commands.connect(Commands.java:1361)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:44)
        at sqlline.SqlLine.dispatch(SqlLine.java:818)
        at sqlline.SqlLine.initArgs(SqlLine.java:447)
        at sqlline.SqlLine.begin(SqlLine.java:570)
        at io.gizmosql.sqlline.GizmoSQLLine.main(GizmoSQLLine.java:48)
Caused by: org.apache.arrow.driver.jdbc.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:31337
Caused by: java.net.ConnectException: Connection refused
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)
        at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
        at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:336)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:339)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:784)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:732)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:658)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at org.apache.arrow.driver.jdbc.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:840)

The only change I made was adding 'sudo' to the docker run command, since it wasn't working for me otherwise. Any ideas?

@prmoore77
Contributor Author

Hi @george-larionov - could you share the gizmosql container logs? I'll retry my test also to see if I get the same thing... thanks!

@prmoore77
Contributor Author

prmoore77 commented Jan 11, 2026

Hi @george-larionov - I think I figured it out. My AWS EC2 provisioning scripts (attached) mount the recommended EBS volume at /nfs_data on the virtual machine. If that path isn't available, the GizmoSQL Docker container will crash, because the docker command bind-mounts that path: --mount type=bind,source=/nfs_data,target=/opt/gizmosql/data

Could you try with a volume mounted at that path?
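A minimal sketch of a preflight check for that mount source, assuming the script is allowed to create the directory when it is missing. The `ensure_data_dir` helper and the `DATA_DIR` variable are hypothetical names and are not part of the submitted scripts; only the /nfs_data default comes from the docker command above.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper: make sure the bind-mount source exists
# before `docker run` is called. DATA_DIR is an assumed override variable;
# /nfs_data is the path used in the docker command.

ensure_data_dir() {
    local dir="${1:-${DATA_DIR:-/nfs_data}}"
    if [ -d "$dir" ]; then
        echo "using existing data directory: $dir"
        return 0
    fi
    # Create the directory instead of letting the container fail on a
    # missing mount source.
    if mkdir -p "$dir" 2>/dev/null; then
        echo "created data directory: $dir"
        return 0
    fi
    echo "cannot create data directory: $dir (try sudo or another path)" >&2
    return 1
}
```

Calling `ensure_data_dir` before the docker command would turn a missing path into an explicit message rather than a later connection error.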

Here are the scripts:
ec2_provisioning_scripts.zip

If you would like me to put the provisioning scripts in the codebase, just let me know. I didn't do so because I didn't see that others had done so... Thanks for your help!

@prmoore77
Contributor Author

hi @george-larionov - were you able to test?

@george-larionov
Member

Hi @prmoore77, sorry for the late reply, it's been a busy week. I tried again after replacing /nfs_data with an existing path, but I am still getting the same error, so I don't think it is related to the mount path.

The expectation is that the benchmark.sh script can run on a fresh Ubuntu machine on AWS, so please include any specific mounting you need (although perhaps it would be simpler to just use the default mount paths). Take a look at the CedarDB or DuckDB scripts.

Additionally, the way the benchmarks are run is semi-automated using the run-benchmark.sh script, which fills in the cloud-init.sh script with the correct details and then runs it on the AWS instance, so take a look there for the exact way that things are run.

I'm also attaching the log.txt file. Is this the one you meant?

Let me know if you have any questions, I will try to be more timely in my replies 😳

"column-oriented",
"arrow-flight-sql",
"duckdb",
"lukewarm-cold-run"
Member

@rschu1ze rschu1ze Jan 18, 2026

Please see here for more details on what warm / lukewarm and cold runs mean in the context of ClickBench.

Two comments related to that.

run.sh contains this:

# Execute all queries in one session (so authentication overhead is minimized)
echo "Running benchmark with $(wc -l < queries.sql) queries, ${TRIES} tries each..."

gizmosqlline \
  -u 'jdbc:arrow-flight-sql://localhost:31337?useEncryption=true&disableCertificateVerification=true' \
  -n clickbench \
  -p clickbench \
  -f "${TEMP_SQL_FILE}"

The script does not clear the OS page cache between query runs - which it should do to qualify as a "lukewarm run". If authentication is costly, then feel free to disable it entirely (the script seems to disable certificate validation already).
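A minimal sketch of clearing the OS page cache on Linux, assuming passwordless sudo on the benchmark machine. The `drop_os_caches` function name and the `DRY_RUN` switch are hypothetical and only there so the command can be inspected without touching the kernel.

```shell
#!/usr/bin/env bash
# Flush dirty pages to disk, then ask the kernel to drop the page cache,
# dentries, and inodes. DRY_RUN=1 (an assumed switch) only prints the command.

drop_os_caches() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sync && echo 3 > /proc/sys/vm/drop_caches"
        return 0
    fi
    sync
    echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
}
```

For a lukewarm run this would be called between query runs, so the first try of each query cannot read data left in memory by the previous one.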

The other thing is that we are moving away from lukewarm runs. New submissions ideally do cold runs right from the start. Would it be possible to kill/start the GizmoSQL docker container between query runs?

Contributor Author

hi @rschu1ze - would we need to kill/restart the container between each individual query execution?

Member

You would need to do this:

  • kill/restart the container, then run query 1 three times (the first execution is a cold run, the second and third execution are hot runs),
  • kill/restart the container, then run query 2 three times,
  • kill/restart the container, then run query 3 three times,
  • etc.
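The steps above can be sketched as a loop, under the assumption that the queries file holds one query per line. `restart_server` and `run_query` are hypothetical hooks with placeholder bodies; the real script would put its own stop/start and client commands there. `TRIES` mirrors the variable that run.sh already prints.

```shell
#!/usr/bin/env bash
# Hypothetical hooks: replace the bodies with the real commands.
restart_server() { echo "restart"; }   # e.g. stop the server, drop caches, start it again
run_query()      { echo "run: $1"; }   # e.g. send "$1" through the SQL client and time it

TRIES="${TRIES:-3}"

run_all_queries() {
    local queries_file="$1" query i
    # Read one query per line; also handle a last line without a newline.
    while IFS= read -r query || [ -n "$query" ]; do
        [ -z "$query" ] && continue
        restart_server              # once per query, so try 1 is the cold run
        i=1
        while [ "$i" -le "$TRIES" ]; do
            run_query "$query"      # tries 2..TRIES reuse the running server (hot runs)
            i=$((i + 1))
        done
    done < "$queries_file"
}
```

One caveat with this shape: a real `run_query` that reads standard input would swallow the remaining lines of the queries file, so its client command should take its input from /dev/null or from a file argument.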

Contributor Author

Thanks for the clarification, I appreciate it. Sorry for my confusion - I'm a little new to this...

Contributor Author

I may switch to just using the gizmosql_server executable directly - to eliminate Docker from the test stack... It will make the installation just a tad more difficult - but not too bad...

Member

> Thanks for the clarification, I appreciate it. Sorry for my confusion - I'm a little new to this...

It is no problem at all! We are always happy to expand the benchmark to new systems.

> I may switch to just using the gizmosql_server executable directly - to eliminate Docker from the test stack... It will make the installation just a tad more difficult - but not too bad...

It is totally up to you, some DBs use docker (like CedarDB) while others install directly (like DuckDB).

@prmoore77
Contributor Author

hi @george-larionov and @rschu1ze, I have refactored the benchmarking for GizmoSQL to use direct executables instead of Docker. I also stop/start the server between each query as directed, and I clear the system cache before running the benchmark.

Could you re-review? Thanks!

@prmoore77
Contributor Author

prmoore77 commented Jan 26, 2026

hi @george-larionov - have I addressed your concerns? Also - I wasn't clear - do you all run the script (via CI, etc.) for all of the platforms - so scores will be shown for all of them, or do I need to run it for the other machine types?

@george-larionov
Member

george-larionov commented Jan 27, 2026

Hi @prmoore77, I successfully reproduced the results locally (🎉), but I am still trying to get it to run with the automated script. I think I need to figure that out on our end, but I will keep you updated. Regarding the scores for the other machines: yes, once we get your script running in our automation we will run it for multiple sizes ourselves, so you don't need to.

@george-larionov
Member

george-larionov commented Jan 27, 2026

@prmoore77 here are the results from my run, just FYI (although, I will use the results from our automated run once I get it running for the final submission):
[0.059, 0.009, 0.007],
[0.121, 0.012, 0.01],
[0.155, 0.03, 0.03],
[0.33, 0.046, 0.045],
[0.362, 0.256, 0.259],
[0.829, 0.43, 0.428],
[0.093, 0.014, 0.014],
[0.098, 0.017, 0.015],
[0.51, 0.355, 0.358],
[0.672, 0.511, 0.508],
[0.288, 0.131, 0.129],
[0.287, 0.143, 0.138],
[0.536, 0.39, 0.395],
[0.886, 0.686, 0.686],
[0.597, 0.421, 0.415],
[0.432, 0.314, 0.321],
[1.034, 0.831, 0.839],
[0.782, 0.559, 0.569],
[1.989, 1.547, 1.561],
[0.08, 0.013, 0.012],
[17.398, 0.463, 0.454],
[0.923, 0.463, 0.46],
[11.514, 0.545, 0.596],
[0.53, 0.133, 0.121],
[0.113, 0.039, 0.041],
[0.223, 0.142, 0.141],
[0.114, 0.037, 0.035],
[0.811, 0.343, 0.343],
[12.675, 8.034, 8.005],
[0.138, 0.051, 0.045],
[0.624, 0.341, 0.341],
[2.051, 0.414, 0.421],
[2.006, 1.701, 1.702],
[2.265, 1.654, 1.65],
[2.37, 1.788, 1.767],
[0.511, 0.409, 0.412],
[0.099, 0.048, 0.044],
[0.08, 0.021, 0.019],
[0.083, 0.023, 0.021],
[0.145, 0.08, 0.073],
[0.077, 0.019, 0.017],
[0.078, 0.018, 0.015],
[0.079, 0.023, 0.023],

@george-larionov george-larionov merged commit 6e0f7ed into ClickHouse:main Jan 27, 2026
@george-larionov
Member

@prmoore77 I have merged the branch to make it easier for me to debug why the automation isn't working. I will let you know once I update the actual benchmark results and add them to ClickBench.

@prmoore77 prmoore77 deleted the feature/gizmosql branch January 27, 2026 13:29
@prmoore77
Contributor Author

hi @george-larionov - are there any issues I can help with regarding the automation and getting the script to work?

@george-larionov
Member

george-larionov commented Jan 27, 2026

@prmoore77, not having much luck debugging the issue. It seems that when I run benchmark.sh manually, everything works. But when I try to run via run-benchmark.sh and cloud-init.sh on a new EC2 instance, things don't work. Here is the end of the log generated by cloud-init.sh (before these lines it is doing normal apt stuff):

Archive:  gizmosql.zip
  inflating: gizmosql_server         
  inflating: gizmosql_client         
/tmp /ClickBench/gizmosql
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 46.1M  100 46.1M    0     0  92.9M      0 --:--:-- --:--:-- --:--:-- 92.9M
/ClickBench/gizmosql
Waiting for gizmosql_server to start...

(hangs here)

And here is the gizmosql_server.log:

2026-01-27T22:25:34.002898Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - GizmoSQL Core - Copyright (c) 2026 GizmoData LLC
 Licensed under the Apache License, Version 2.0
 https://www.apache.org/licenses/LICENSE-2.0
2026-01-27T22:25:34.002933Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - Overall Log Level is set to: info
2026-01-27T22:25:34.006338Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - Query timeout (in seconds) is set to: 0 (unlimited)
2026-01-27T22:25:34.006368Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - Query Log Level is set to: INFO
2026-01-27T22:25:34.006380Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - Authentication Log Level is set to: INFO
2026-01-27T22:25:34.006429Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - ----------------------------------------------
2026-01-27T22:25:34.006663Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - System: Ubuntu 24.04.3 LTS (linux/x86_64)
2026-01-27T22:25:34.006679Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - CPU: AMD EPYC 7R13 Processor (16 cores)
2026-01-27T22:25:34.006731Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - Memory: 30.6 GB
2026-01-27T22:25:34.006744Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - Apache Arrow version: 23.0.0
2026-01-27T22:25:34.006794Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - WARNING - TLS is disabled for the GizmoSQL server - this is NOT secure.
2026-01-27T22:25:34.007930Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - Access logging disabled
2026-01-27T22:25:34.007954Z INFO pid=2733 tid=124537869763712 component=gizmosql_server - DuckDB version: v1.4.4
2026-01-27T22:25:34.066661Z INFO pid=2733 tid=124537869763712 instance_id=729f0a53-13c2-4306-96b3-e3eb9e67d4f5 component=gizmosql_server - Server instance created
2026-01-27T22:25:34.066677Z INFO pid=2733 tid=124537869763712 instance_id=729f0a53-13c2-4306-96b3-e3eb9e67d4f5 component=gizmosql_server - Instrumentation is disabled
2026-01-27T22:25:34.066875Z INFO pid=2733 tid=124537869763712 instance_id=729f0a53-13c2-4306-96b3-e3eb9e67d4f5 component=gizmosql_server - Running Init SQL command: 
SET autoinstall_known_extensions = true;
2026-01-27T22:25:34.067025Z INFO pid=2733 tid=124537869763712 instance_id=729f0a53-13c2-4306-96b3-e3eb9e67d4f5 component=gizmosql_server - Running Init SQL command: 
 SET autoload_known_extensions = true;
2026-01-27T22:25:34.067172Z INFO pid=2733 tid=124537869763712 instance_id=729f0a53-13c2-4306-96b3-e3eb9e67d4f5 component=gizmosql_server - Running Init SQL command: 
INSTALL spatial;
Error: Invalid: IO Error: Can't find the home directory at ''
Specify a home directory using the SET home_directory='/path/to/dir' option.

It seems the underlying DuckDB needs some settings? Not sure...
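The hang at "Waiting for gizmosql_server to start..." is consistent with a wait loop that keeps polling after the server process has already exited. A minimal sketch of a bounded wait follows, assuming the script knows the server's PID. The `wait_for_server` function is a hypothetical helper, not the loop used in benchmark.sh.

```shell
#!/usr/bin/env bash
# Wait until a readiness probe succeeds, but give up when the server process
# has exited or the timeout elapses, instead of looping forever.
# Arguments: <server pid> <timeout in seconds> <probe command...>

wait_for_server() {
    local pid="$1" timeout="$2" waited=0
    shift 2
    while ! "$@" > /dev/null 2>&1; do
        # kill -0 sends no signal; it only checks that the process exists.
        if ! kill -0 "$pid" 2> /dev/null; then
            echo "server process $pid exited before becoming ready" >&2
            return 1
        fi
        if [ "$waited" -ge "$timeout" ]; then
            echo "server not ready after ${timeout}s" >&2
            return 2
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

A possible call, assuming netcat is installed and the server listens on port 31337 as in the JDBC URL earlier in this thread: `wait_for_server "$SERVER_PID" 60 nc -z localhost 31337`. With this shape the failed INSTALL would surface as a non-zero exit rather than a stuck cloud-init run.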

@prmoore77
Contributor Author

Hi @george-larionov - I think that the env var: HOME must be set for the extensions to install. Could you set that to: /home/ubuntu and retry?

export HOME=/home/ubuntu
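A slightly more defensive variant, as a sketch: set HOME only when it is unset or empty, which may be the case when the script is launched from cloud-init rather than from a login shell. The `ensure_home` function is a hypothetical helper; /home/ubuntu is the value suggested above.

```shell
#!/usr/bin/env bash
# Give HOME a fallback only when it is unset or empty, so manual runs keep
# their own value. The default /home/ubuntu matches the suggestion above.

ensure_home() {
    if [ -z "${HOME:-}" ]; then
        HOME="${1:-/home/ubuntu}"
        export HOME
    fi
    echo "$HOME"
}
```

Calling `ensure_home > /dev/null` near the top of benchmark.sh would leave manual runs unchanged while giving the extension install step a home directory under automation.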

@rschu1ze rschu1ze mentioned this pull request Jan 28, 2026
@george-larionov
Member

> Hi @george-larionov - I think that the env var: HOME must be set for the extensions to install. Could you set that to: /home/ubuntu and retry?
>
> export HOME=/home/ubuntu

@prmoore77 that worked!
